Optical Character Recognition of Non-flat Small Documents Using Android: A Case Study
Authors
Abstract
Optical Character Recognition (OCR) on Android has received a great deal of attention. In all of these efforts, a flat document is selected for OCR, and non-flat documents are ignored. Labels mounted on cylindrical surfaces, such as wine bottles, pill boxes, and cans, are examples of non-flat documents. The goal of this research effort is to perform optical character recognition on a non-flat document. More specifically, we investigate OCR of a cylindrical pill box label. Two pictures of the label are needed for the OCR. The methodology was applied to 30 synthesized non-flat labels, and on average the system accurately recognized 92.4% of the characters.
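The abstract does not say how the per-character accuracy was measured. As a minimal sketch, assuming one common convention (one minus the edit distance between the recognized text and the ground truth, divided by the ground-truth length, then averaged over labels), the figure could be computed along these lines. The function names are hypothetical and are not taken from the paper.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insert, delete, substitute each cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_accuracy(truth: str, recognized: str) -> float:
    """Assumed metric: 1 - edit_distance / len(truth), clamped to [0, 1]."""
    if not truth:
        return 1.0 if not recognized else 0.0
    errors = levenshtein(truth, recognized)
    return max(0.0, 1.0 - errors / len(truth))


def mean_accuracy(pairs) -> float:
    """Average per-label accuracy over (truth, recognized) string pairs."""
    scores = [character_accuracy(t, r) for t, r in pairs]
    return sum(scores) / len(scores)
```

Averaging per label, rather than pooling all characters, weights every label equally regardless of its length; whether the paper does this is not stated.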
Similar resources
Noise-tolerance feasibility for restricted-domain Information Retrieval systems
Information Retrieval systems normally have to work with rather heterogeneous sources, such as Web sites or documents from Optical Character Recognition tools. The correct conversion of these sources into flat text files is not a trivial task since noise may easily be introduced as a result of spelling or typeset errors. Interestingly, this is not a great drawback when the size of the corpus is...
Identification of Alphanumeric pattern using Android
“Identification of Alphanumeric pattern using Android” is a smartphone app for the Android platform that combines Optical Character Recognition with the identification of alphanumeric patterns; after processing, the data is stored on a server. This paper presents the design of an app, built with the Android SDK, that will enable the Identification of Alphanumeric pattern using optical char...
Reagent Label Text Detection Using the Stroke Width Transform
We implemented an algorithm to extract text from reagent labels to facilitate the retrieval of safety information in a laboratory using an Android mobile device. Our algorithm combined the stroke width transform and a variety of connected component filters to detect text candidates. These were then processed using Tesseract, an open-source optical character recognition engine. We concluded that...
OCRdroid: A Framework to Digitize Text Using Mobile Phones
As demand grows for mobile phone applications, research in optical character recognition, a technology well developed for scanned documents, is shifting focus to the recognition of text embedded in digital photographs. In this paper, we present OCRdroid, a generic framework for developing OCR-based applications on mobile phones. OCRdroid combines a light-weight image preprocessing suite install...
OCRdroid: A Framework to Digitize Text on Smart Phones
As demand grows for mobile phone applications, research in optical character recognition, a technology well developed for scanned documents, is shifting focus to the recognition of text embedded in digital photographs. In this paper, we present OCRdroid, a generic framework for developing OCR-based applications on mobile phones. OCRdroid combines a lightweight image preprocessing suite installe...